# Web Design Prototypicality Dataset

## General Description

The dataset contains webpage screenshots (always homepages, always full-page, always in *.jpeg) and their user ratings. The dataset was collected for studying design prototypicality - a design feature that characterizes how much a design visually differs from the average (canonical, typical) design for a Web domain.

The dataset could be re-used for studying other user-perception related phenomena and design dimensions, and for computational analyses of designs (e.g., using image-processing algorithms to estimate webpage visual aesthetics).

---

## Participant data

Each participant evaluated one dimension for multiple screenshots. Each screenshot was automatically scrolled down to ensure that a participant saw it entirely before rating. The three prototypicality-related dimensions - Exemplar Goodness, Family Resemblance, and Typicality - have ~7 ratings per screenshot and were collected using Likert-type scales. All other dimensions have ~4.5 ratings per screenshot and were collected using semantic-differential scales. All scales were 7-point. Before averaging, ratings were scaled within each participant (to account for participants using the scales differently).
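The within-participant scaling step can be illustrated with a minimal sketch. The exact scaling method is not specified here, so the example assumes z-scoring (subtracting a session's mean and dividing by its standard deviation); the function names and toy data are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def scale_within_participant(ratings):
    """Z-score each score against the other scores of the same session.

    `ratings` is a list of (session_id, stimulus_id, score) tuples.
    Z-scoring is an assumption; the dataset's actual method may differ.
    """
    by_session = defaultdict(list)
    for session_id, _, score in ratings:
        by_session[session_id].append(score)

    scaled = []
    for session_id, stimulus_id, score in ratings:
        scores = by_session[session_id]
        mu, sigma = mean(scores), pstdev(scores)
        # A session with identical scores has sigma == 0; map those to 0.0
        # instead of dividing by zero.
        z = (score - mu) / sigma if sigma > 0 else 0.0
        scaled.append((session_id, stimulus_id, z))
    return scaled


def average_per_stimulus(scaled):
    """Average the scaled scores over all sessions that rated a stimulus."""
    by_stimulus = defaultdict(list)
    for _, stimulus_id, z in scaled:
        by_stimulus[stimulus_id].append(z)
    return {stim: mean(zs) for stim, zs in by_stimulus.items()}


# Toy data: session s1 uses the top of the scale, s2 the bottom,
# yet both rank b.jpeg above a.jpeg.
toy = [
    ("s1", "a.jpeg", 6), ("s1", "b.jpeg", 7),
    ("s2", "a.jpeg", 1), ("s2", "b.jpeg", 3),
]
averages = average_per_stimulus(scale_within_participant(toy))
```

After scaling, the two sessions agree on the relative standing of the screenshots even though their raw scores sit at opposite ends of the 7-point scale.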

The tab-separated datafiles (all ratings.avg.*.txt) with average user scores have the following columns:

 - EXMPL - Exemplar Goodness (This webpage is a representative example of a homepage of online-shopping websites)
 - AVG - Family Resemblance (This webpage has many visual aspects in common with homepages of other online-shopping websites)
 - TYP - Typicality (This webpage looks like a typical homepage of an online-shopping website)

 - TRU - Trustworthiness (This webpage looks: Not trustworthy/Trustworthy)
 - AE - Visual Aesthetics (This webpage looks: Ugly/Beautiful)
 - US - Pre-use Usability (This webpage looks: Easy to use/Difficult to use)

The tab-separated datafiles ratings.raw.*.txt contain raw user scores, one rating per row. They include the scores of crowdworkers who were identified as cheaters and whose scores were not used in the average-score estimation (strictly speaking, the scores of sessions, since crowdworkers could re-take the study and do it properly).

The columns are:

 - stimulusId - the name of a webpage screenshot
 - isDuplicate - some webpages were rated twice by the same participant, for quality control
 - isTraining - the same three webpages were always shown first, for participants' rating criteria calibration
 - dimension - the id of a measured dimension, see the list above
 - sessionId - a unique session ID; should be used to match ratings with demographics data
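Matching raw ratings with demographics via sessionId might look like the sketch below. It uses toy in-memory data in place of the real files, and several details are assumptions: the name of the score column (`rating`), the 0/1 encoding of the flag columns, and the choice to drop training rows. The susCheat flag comes from the demographics files described in the Demographic data section.

```python
import csv
import io

# Toy stand-ins for a ratings.raw.*.txt file and a demographics file.
# The `rating` column name and the 0/1 flag encoding are assumptions.
RAW = (
    "stimulusId\tisDuplicate\tisTraining\tdimension\tsessionId\trating\n"
    "shop1.jpeg\t0\t0\tTYP\tA\t5\n"
    "shop2.jpeg\t0\t1\tTYP\tA\t6\n"
    "shop1.jpeg\t0\t0\tTYP\tB\t1\n"
)
DEMO = (
    "sessionId\tsusCheat\n"
    "A\t0\n"
    "B\t1\n"
)


def read_tsv(text):
    """Parse tab-separated text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def usable_ratings(raw_rows, demo_rows, drop_training=True):
    """Attach demographics to each rating and drop flagged sessions."""
    demo_by_session = {row["sessionId"]: row for row in demo_rows}
    kept = []
    for row in raw_rows:
        demo = demo_by_session.get(row["sessionId"])
        # Skip ratings without demographics and sessions flagged as cheating.
        if demo is None or demo["susCheat"] == "1":
            continue
        # Dropping calibration (training) rows is a choice made for this
        # sketch, not a documented step of the original averaging.
        if drop_training and row["isTraining"] == "1":
            continue
        kept.append({**row, **demo})
    return kept


kept = usable_ratings(read_tsv(RAW), read_tsv(DEMO))
```

For the real files, `io.StringIO(text)` would be replaced by an opened file handle; the rest of the logic stays the same.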

---

## Demographic data

The tab-separated demographics datafiles contain the following columns:

 - sessionId - a unique sessionId
 - Gender/Education/Age/Occupation - hopefully, self-explanatory
 - WebUse - Self-reported number of hours a day browsing the Web
 - BankFamiliarity/ShopFamiliarity/UnivFamiliarity - Self-rated familiarity with bank/shopping/university websites (1 to 7); present depending on a dataset/domain
 - Lang - Speaking English fluently or not
 - Country - Country of origin
 - OtherLang - Non-English languages that a participant reported to speak and browse the Web in
 - wscreen/hscreen - Width/Height of browser inner window (space available to a webpage)
 - t2send - Time taken to fill out and submit demographics info
 - ifVpn - If VPN use was detected
 - domain - Dataset webpage type/domain, can be fash/home/banks/unis
 - devicePixelRatio - number of physical pixels per virtual pixel in a participant's browser
 - susCheat - Participant was detected to cheat/not take the study seriously in this session
 - precision/recall/accuracy - Participant's performance on a recognition test (for quality control); the test included selecting the previously-shown webpages out of a list (10 shown and 10 new; 20 total to choose from)
 - uid - User ID; not unique per row, since a participant could re-do the study if they were detected to cheat
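As an illustration of the precision/recall/accuracy columns, the sketch below scores one recognition test using the standard definitions of these metrics. That the dataset's values were computed exactly this way is an assumption, and the function and item names are hypothetical.

```python
def recognition_scores(shown, new, selected):
    """Score a recognition test with standard precision/recall/accuracy.

    shown    - items the participant actually saw during the study
    new      - distractor items that were never shown
    selected - items the participant marked as previously seen
    """
    shown, new, selected = set(shown), set(new), set(selected)
    tp = len(selected & shown)   # seen items correctly selected
    fp = len(selected & new)     # new items wrongly selected
    fn = len(shown - selected)   # seen items that were missed
    tn = len(new - selected)     # new items correctly left out
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, accuracy


# 10 shown and 10 new webpages, 20 in total, as in the test described above.
shown = [f"shown{i}" for i in range(10)]
new = [f"new{i}" for i in range(10)]

# A participant who picks 8 of the shown pages and 2 of the new ones.
selected = shown[:8] + new[:2]
precision, recall, accuracy = recognition_scores(shown, new, selected)
```

In this toy case all three values equal 0.8: 8 of the 10 selections are correct, 8 of the 10 shown pages are found, and 16 of the 20 pages are classified correctly.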
